Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve AST-level source position tracking. #2076

Merged
merged 9 commits into from
Aug 14, 2024

Conversation

sauclovian-g
Copy link
Contributor

This is the first chunk of work on #2071.

Track positions explicitly in every expression and pattern constructor (but not types, yet); kill off the Eq instances this renders problematic (or were already problematic). Kill off the Eq instance for Type too and clean up the things that were trying to use it.

@sauclovian-g sauclovian-g added error-messages usability An issue that impedes efficient understanding and use tech-debt error-handling labels Jul 18, 2024
@sauclovian-g sauclovian-g self-assigned this Jul 18, 2024
@sauclovian-g
Copy link
Contributor Author

I would appreciate fairly close attention to 10dbb25, because even though it's fairly simple it's the kind of thing where if you botch it up you'll still be finding problems three years later. (Recommend looking at that commit by itself for this.)

src/SAWScript/MGU.hs Outdated Show resolved Hide resolved
src/SAWScript/Parser.y Outdated Show resolved Hide resolved
src/SAWScript/Parser.y Outdated Show resolved Hide resolved
src/SAWScript/Position.hs Outdated Show resolved Hide resolved
src/SAWScript/Position.hs Outdated Show resolved Hide resolved
src/SAWScript/AST.hs Outdated Show resolved Hide resolved
src/SAWScript/Interpreter.hs Show resolved Hide resolved
Copy link
Contributor

@RyanGlScott RyanGlScott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work. One last request: did you verify that these changes improve the source locations in at least one program? If so, can you check it in as a test case?

@sauclovian-g
Copy link
Contributor Author

The case in question is: let foo () : String = 3;, which in the new regime reports the location of 3 and in the old world reports the whole span from foo to 3.

Sticking it in the test suite raises some other questions though (mostly about how to prepare for #2075); not opposed to doing it but those questions probably ought to get answered first. See other message elsewhere...

@RyanGlScott
Copy link
Contributor

I propose one way to handle adding test cases of this sort in #2075 (comment).

Copy link
Contributor

@RyanGlScott RyanGlScott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Do make sure to rename the type_errors subdirectory to something like test_type_errors so that SAW's test suite driver picks it up (see here).

Comment on lines +105 to +114
# Prune the timestamps from the log since they'll never match.
# We also need to shorten pathnames that contain the current
# directory, because saw feels the need to expand pathnames
# (see also saw-script #2082).
sed < $TEST.rawlog > $TEST.log '
/^\[[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\.[0-9][0-9][0-9]\] /{
s/^..............//
}
s,'"$CURDIR"'/,,
'
Copy link
Contributor

@RyanGlScott RyanGlScott Aug 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, we'd introduce a settings into SAW that:

  • Omits timestamps
  • Shorten path names

So that we don't have to strip them out with sed afterwards. Can you file issues about these? (No need to hold up this PR, but I want to make sure the ideas are recorded.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed #2090; #2082 is also relevant to the pathnames issue.

Add positions to all expression and pattern nodes. Kill off the LExpr
and LPattern expression and pattern cases that held just a position.
The idea had been to always use the first enclosing such position.
This doesn't provide accurate enough positions for type errors (in
general for a type error you always want the exact expression you're
complaining about, especially if you're printing spans of positions
and not just line numbers), and, worse, with this approach it's hard
to tell if any given position is the one you actually ought to be
using.

Fix up some of the position usage where prompted by the ensuing typing
changes.

Kill off the currentExprPos field of the local state monad in MGU.hs,
which was for collecting the enclosing LExpr and is no longer needed.

In pursuit of all this, add two ops to Position.hs:
   - leadingPos, which gets the empty (no span) position at the beginning
     of the position of something else, which can be used for deriving the
     position of implicit elements;
   - maxSpan', which is like maxSpan but works for (exactly) two Positioned
     objects of potentially heterogeneous type. This allows dropping a
     number of explicit calls to getPos in places where that's verbose and
     ugly.
These contain source positions, so derived Eq instances aren't in
general going to produce the desired behavior.

Leave the Eq instance on Type for the moment because it's used in the
typechecker. This is going to require further attention.
There were only two uses of it; provide matching-based alternatives
for both, on the grounds that if we don't need an explicit Eq instance
that specifically ignores position information it's better not to have
to write one.

The use in MGU handled peeling off any LType nodes; the one in
Interpreter did not and was therefore potentially wrong...
We concluded after discussion that using the more precise position
from the pattern involved is preferable to using PosREPL, even though
the name "it" is predicated on this coming from the repl.
Also add a bunch of infrastructure -- the driver script for this is
massively overkill for just this test but we expect this test to grow
a lot of siblings in the not-too-distant future.
@sauclovian-g
Copy link
Contributor Author

Force-push was to rebase on master -- the only things in master since this branch started were submodule bumps, which are extremely unlikely to interact with the things I've been doing.

(merged with master first so the force-push comparison gh offers will come out identical)

@sauclovian-g
Copy link
Contributor Author

(careful examination will also reveal that I squashed the "rename test dir" commit; there's no point preserving that glitch, plus git's shortcomings with rename handlings make it better to avoid having any when the opportunity arises)

@sauclovian-g sauclovian-g merged commit e687666 into master Aug 14, 2024
10 checks passed
@sauclovian-g sauclovian-g deleted the 2071-dholland-positioning branch August 14, 2024 03:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
error-handling error-messages tech-debt usability An issue that impedes efficient understanding and use
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants